Author Details


Paramasivam, Ilango

Anonymization in PPDM based on Data Distributions and Attribute Relations


Authors

Jitendra Kumar Jaiswal ¹, Rita Samikannu ¹, Ilango Paramasivam ²

Affiliations
1 School of Advanced Sciences, VIT University, Vellore - 632014, Tamil Nadu, IN
2 School of Computer Science and Engineering, VIT University, Vellore - 632014, Tamil Nadu, IN

Source

Indian Journal of Science and Technology, Vol 9, No 37 (2016), Pagination:

Abstract

Objectives: Privacy Preserving Data Mining (PPDM) techniques deal with publishing or communicating data securely, without revealing private or sensitive information about any individual. Anonymization is considered one of the most effective of these techniques because it offers a better trade-off between data utility and privacy preservation. Methods/Statistical Analysis: Existing anonymization techniques work on individual attributes and their cardinalities and do not consider the relations among different attributes of the data. In this paper we consider auxiliary information, and we use entropy and mutual information to calculate the distribution of entities within an attribute and the relations among different attributes, respectively. Based on these calculations we analyze the best generalization level for data anonymization. Findings: An adversary can analyze the data from many perspectives, such as auxiliary information and theoretical and manual data analysis, and try to exploit its vulnerabilities, so data privacy can be improved if we also look at the data through the adversary's eyes. Applications/Improvements: Other techniques can be applied to find distributions and relations, depending on the background of the data and its area of application.
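
The abstract names entropy and mutual information as the measures of attribute distribution and inter-attribute relation, but it does not give the formulas or the procedure the authors used. As a minimal sketch only, assuming the standard Shannon definitions and a hypothetical toy table (the column names and values below are invented for illustration), the two quantities could be computed like this:

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """Mutual information I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    if len(xs) != len(ys):
        raise ValueError("attribute columns must have the same length")
    joint = list(zip(xs, ys))  # each row's pair of values is one joint outcome
    return entropy(xs) + entropy(ys) - entropy(joint)


# Hypothetical toy table: these columns are NOT taken from the paper.
zip_code = ["632014", "632014", "641046", "641046"]
disease = ["flu", "flu", "cold", "cold"]
gender = ["M", "F", "M", "F"]

h_zip = entropy(zip_code)                             # spread of values in one attribute
mi_zip_disease = mutual_information(zip_code, disease)
mi_gender_disease = mutual_information(gender, disease)

print(f"H(zip_code)        = {h_zip:.3f} bits")
print(f"I(zip_code;disease) = {mi_zip_disease:.3f} bits")
print(f"I(gender;disease)   = {mi_gender_disease:.3f} bits")
```

In this toy table, zip_code determines disease completely while gender tells nothing about it, so a high mutual information value flags the attribute pair that an adversary could link. How such values map to a generalization level is the paper's contribution and is not reproduced here.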

Keywords

Auxiliary Information, Data Anonymization, Entropy, Mutual Information, Privacy Preserving Data Mining (PPDM).


A Study of Impact on Missing Categorical Data - A Qualitative Review

A Study on Impact of Dimensionality Reduction on Naïve Bayes Classifier


Authors

Priya Mohan ¹, Ilango Paramasivam ²

Affiliations
1 Department of Computer Science, Bharathiar University, Coimbatore – 641046, Tamilnadu, IN
2 School of Computing Science & Engineering, VIT University, Vellore – 632014, Tamilnadu, IN

Source

Indian Journal of Science and Technology, Vol 10, No 20 (2017), Pagination:

Abstract

Objectives: The time complexity of a machine learning algorithm is directly proportional to the dimensionality of the dataset. In this paper, the impact of dataset dimensionality on a machine learning algorithm, the Naïve Bayes Classifier, is evaluated with all feature subsets to analyze whether there is any variation in performance. Methods/Statistical Analysis: The Naïve Bayes Classifier is evaluated in terms of correctly and incorrectly classified instances. The Pima Indian Type II diabetes dataset is used for the experimental study. A confusion matrix is formulated for the performance of the Naïve Bayes Classifier using 10-fold cross validation for each run. The study exhibits the impact of dimensionality on the performance of the Naïve Bayes Classifier. Findings: The Naïve Bayes Classifier classifies patient records as either diabetes or non-diabetes using the values of the feature set. It is a probabilistic approach to classifying patient records into a binary class. It is found that the dimensionality of the feature set affects the performance of the Naïve Bayes Classifier in terms of classification accuracy and the number of true positives, true negatives, false positives and false negatives. Incorrect classification is certainly dangerous, whereas valid classification helps healthcare systems plan an effective course of treatment that will save the life of the patient. An invalid classification will lead to a wrong diagnosis while formulating the treatment plan, and it will lead to loss of life. Hence, invalid classification in terms of the false negative rate is to be viewed very seriously. In this paper, the study shows that the higher dimensionality of the dataset affects the performance of the Naïve Bayes Classifier. Application/Improvements: The results can be used in medical informatics for quality diagnosis and effective treatment planning. The focus on the false positive rate in the classification accuracy of the Naïve Bayes Classifier will notably help healthcare systems diagnose patients accurately to save lives.
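
The abstract compares feature subsets by classification accuracy and by the four confusion-matrix counts, with particular weight on the false negative rate. As a minimal sketch of that bookkeeping only, the snippet below derives those figures from one run's labels. The label lists are made-up placeholders, not results from the Pima dataset, and the function names are hypothetical.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives/negatives for a binary classifier."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1   # predicted diabetes, patient has diabetes
            else:
                fp += 1   # predicted diabetes, patient does not
        else:
            if a == positive:
                fn += 1   # missed case: the costly error in diagnosis
            else:
                tn += 1
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}


def summarize(counts):
    """Derive accuracy, false negative rate and false positive rate."""
    total = sum(counts.values())
    positives = counts["tp"] + counts["fn"]
    negatives = counts["tn"] + counts["fp"]
    return {
        "accuracy": (counts["tp"] + counts["tn"]) / total if total else 0.0,
        "false_negative_rate": counts["fn"] / positives if positives else 0.0,
        "false_positive_rate": counts["fp"] / negatives if negatives else 0.0,
    }


# Placeholder labels for illustration (1 = diabetes, 0 = non-diabetes).
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(actual, predicted)
metrics = summarize(counts)
print(counts)
print(metrics)
```

Repeating this summary for each feature subset, with predictions pooled over the cross-validation folds, would give one row per subset to compare. The classifier and the fold splitting are deliberately left out of the sketch.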

Keywords

Classification Accuracy, Dimensionality Reduction, Machine Learning, Naïve-Bayes Classifier.
